This repository has been archived by the owner on Feb 22, 2023. It is now read-only.

Add User-Agent header to all outbound requests coming from the API #877

Merged
sarayourfriend merged 15 commits into main from fix/watermark-useragent on Aug 17, 2022

Conversation

sarayourfriend
Contributor

@sarayourfriend sarayourfriend commented Aug 11, 2022

Fixes

Fixes #826 by @sarayourfriend

Description

The fix should be applied to every outbound request that could reach Wikimedia, not just those in the Watermark endpoint. In practice that is all of our outbound requests, so I've updated each of them.

I explored some ways of centralising this configuration, primarily by subclassing requests.Session and requiring a purpose kwarg on all calls to ::request, which would then configure the UA header. It seemed heavy, though, and I wasn't sure exactly how it would work or whether folks would like it. I think it would make things a little easier to manage, or at least harder to get wrong. But we also use grequests and urlopen, which would still need their own bespoke solutions, so I opted for the simplest approach across the board for now. We can improve, generalise, or abstract this later if we want.
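As a rough illustration of the "simplest approach" mentioned above, the sketch below builds the User-Agent header per call site and passes it explicitly. The template text and the helper names `outbound_headers` and `build_request` are hypothetical and are not the code in this PR. It uses the standard library's `urllib.request` so it runs without touching the network.

```python
from urllib.request import Request

# Hypothetical template for illustration; the project's real value is not
# shown in this thread.
USER_AGENT_TEMPLATE = "ExampleAPI/0.0.0 (purpose: {purpose})"


def outbound_headers(*, purpose: str) -> dict:
    """Build the headers an outbound call should carry.

    `purpose` is keyword-only so a call site cannot omit it by accident.
    """
    if not purpose:
        raise ValueError("purpose must be a non-empty string")
    return {"User-Agent": USER_AGENT_TEMPLATE.format(purpose=purpose)}


def build_request(url: str, *, purpose: str) -> Request:
    """Create a urllib Request with the User-Agent header already attached."""
    return Request(url, headers=outbound_headers(purpose=purpose))


req = build_request("https://example.org/image.jpg", purpose="Watermark")
# urllib stores header names capitalised as "User-agent".
print(req.get_header("User-agent"))
```

The same `outbound_headers(purpose=...)` dictionary could presumably be passed as the `headers` argument of other HTTP clients, which is what keeps this approach simple even when several request libraries are in use.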

Testing Instructions

Run the app locally and ensure there are no issues.

Check out the unit tests. The ones involving urlopen were a complete nightmare to write!
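For readers wondering what a urlopen test can look like, here is a minimal sketch using only the standard library. It is not the test from this PR: `fetch_image`, `check_user_agent_is_sent`, and the `HEADERS` value are hypothetical stand-ins. The idea is to patch `urllib.request.urlopen`, capture the `Request` object it receives, and assert on its header.

```python
import urllib.request
from unittest import mock

# Hypothetical header value, for illustration only.
HEADERS = {"User-Agent": "ExampleAPI/0.0.0 (purpose: Watermark)"}


def fetch_image(url: str) -> bytes:
    """Stand-in for application code that downloads an image with urlopen."""
    request = urllib.request.Request(url, headers=HEADERS)
    with urllib.request.urlopen(request) as response:
        return response.read()


def check_user_agent_is_sent() -> None:
    # urlopen is used as a context manager, so the fake must support
    # __enter__ and hand back an object with a read() method.
    fake_response = mock.MagicMock()
    fake_response.__enter__.return_value.read.return_value = b"image-bytes"

    # Patch the module attribute so no network traffic happens.
    with mock.patch("urllib.request.urlopen", return_value=fake_response) as fake_urlopen:
        body = fetch_image("https://example.org/image.jpg")

    sent_request = fake_urlopen.call_args[0][0]
    assert body == b"image-bytes"
    # urllib stores header names capitalised as "User-agent".
    assert sent_request.get_header("User-agent") == HEADERS["User-Agent"]


check_user_agent_is_sent()
```

Part of the awkwardness is visible here: the fake has to mimic the context-manager protocol, and the header lookup has to use urllib's own capitalisation.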

Checklist

  • My pull request has a descriptive title (not a vague title like Update index.md).
  • My pull request targets the default branch of the repository (main) or a parent feature branch.
  • My commit messages follow best practices.
  • My code follows the established code style of the repository.
  • I added or updated tests for the changes I made (if applicable).
  • I added or updated documentation (if applicable).
  • I tried running the project locally and verified that there are no visible errors.

Developer Certificate of Origin

Version 1.1

Copyright (C) 2004, 2006 The Linux Foundation and its contributors.
1 Letterman Drive
Suite D4700
San Francisco, CA, 94129

Everyone is permitted to copy and distribute verbatim copies of this
license document, but changing it is not allowed.


Developer's Certificate of Origin 1.1

By making a contribution to this project, I certify that:

(a) The contribution was created in whole or in part by me and I
    have the right to submit it under the open source license
    indicated in the file; or

(b) The contribution is based upon previous work that, to the best
    of my knowledge, is covered under an appropriate open source
    license and I have the right under that license to submit that
    work with modifications, whether created in whole or in part
    by me, under the same open source license (unless I am
    permitted to submit under a different license), as indicated
    in the file; or

(c) The contribution was provided directly to me by some other
    person who certified (a), (b) or (c) and I have not modified
    it.

(d) I understand and agree that this project and the contribution
    are public and that a record of the contribution (including all
    personal information I submit with it, including my sign-off) is
    maintained indefinitely and may be redistributed consistent with
    this project or the open source license(s) involved.

@openverse-bot openverse-bot added 💻 aspect: code Concerns the software code in the repository 🛠 goal: fix Bug fix 🟧 priority: high Stalls work on the project or its dependents labels Aug 11, 2022
@github-actions

github-actions bot commented Aug 11, 2022

API Developer Docs Preview: Ready

https://wordpress.github.io/openverse-api/_preview/877

Please note that GitHub Pages takes a little time to deploy newly pushed code. If the links above don't work or you see old versions, wait 5 minutes and try again.

You can check the GitHub pages deployment action list to see the current status of the deployments.

@@ -18,7 +18,7 @@ services:
 environment:
 PORT: 8222
 MALLOC_ARENA_MAX: 2
-command: ["-enable-url-source"]
+command: ["-enable-url-source -forward-headers User-Agent"]
Contributor Author

This new option will need to be ported up to the infrastructure repository as well, if it is merged.

Member

@dhruvkb dhruvkb left a comment

I think this is a reasonable configuration for now. The new env var you've added OUTBOUND_USER_AGENT_TEMPLATE should also be added to the env.template file (if only as a comment because the default value is good enough).
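To make the suggestion concrete, here is a hedged sketch of how a setting like `OUTBOUND_USER_AGENT_TEMPLATE` might be read with a fallback. The default string, the `{purpose}` placeholder syntax, and the function name `load_user_agent_template` are assumptions for illustration; the project's actual default and settings code are not shown in this thread.

```python
import os

# Hypothetical default; the real default value is not shown in this thread.
DEFAULT_TEMPLATE = "ExampleAPI/0.0.0 (purpose: {purpose})"


def load_user_agent_template(environ=os.environ) -> str:
    """Read the template, falling back to the default when unset or empty."""
    template = environ.get("OUTBOUND_USER_AGENT_TEMPLATE") or DEFAULT_TEMPLATE
    # Assumed convention: the template interpolates a "purpose" field.
    if "{purpose}" not in template:
        raise ValueError("template must contain a {purpose} placeholder")
    return template


# With no override set, the default is used.
template = load_user_agent_template({})
print(template.format(purpose="Thumbnail"))
```

Because the default is good enough for local development, documenting the variable as a commented-out line in the env template would tell operators that it exists without forcing them to set it.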

I agree that in the future we can use a central solution to manage requests but that will require all HTTP requests to first use the same libraries which will take a significant amount of time.

@sarayourfriend sarayourfriend force-pushed the fix/watermark-useragent branch from a1a3e53 to 2ac52a4 Compare August 12, 2022 16:41
@sarayourfriend sarayourfriend changed the base branch from main to add/better-status-code-thumbnail-rendering August 12, 2022 16:41
@sarayourfriend sarayourfriend marked this pull request as ready for review August 12, 2022 16:41
@sarayourfriend sarayourfriend requested a review from a team as a code owner August 12, 2022 16:41
@sarayourfriend sarayourfriend requested review from AetherUnbound and obulat and removed request for a team August 12, 2022 16:41
Contributor

@AetherUnbound AetherUnbound left a comment

This is great, thank you for adding thorough tests too. Agreed that a more centralized implementation would be cool but the variety of request methods we have precludes that. I think the "purpose" interpolation you've added is excellent 💯

I have a few comments around the tests but the operational code is ✔️ IMO

api/test/unit/utils/watermark_test.py

assert len(grequests.requests) > 0
for r in grequests.requests:
    assert HEADERS == r.kwargs["headers"]
Contributor

Nit: All the other tests use the format assert actual == expected whereas this one seems reversed.



@pytest.fixture(autouse=True)
def requests(monkeypatch) -> RequestsFixture:
Contributor

Just a thought: I notice that some of these fixtures differ slightly, but a few of them (this one and the same fixture in watermark_test.py) look identical. Do you think it'd be possible to move some of these up to a common conftest.py so they only need to be defined once? Particularly with fixtures as simple as api_client.

Contributor Author

I would rather do this refactor in another PR, as part of a separate issue that looks more carefully at the state of our unit test fixtures in general and thoughtfully establishes a pattern. For now, most of these have subtle differences that would be tedious to abstract.

Contributor

Sounds good 🙂

@sarayourfriend
Contributor Author

Blocked by #875

Base automatically changed from add/better-status-code-thumbnail-rendering to main August 16, 2022 17:08
@sarayourfriend sarayourfriend force-pushed the fix/watermark-useragent branch from bad1118 to e5b4c8b Compare August 16, 2022 17:09
@sarayourfriend sarayourfriend removed the ⛔ status: blocked Blocked & therefore, not ready for work label Aug 16, 2022
Contributor

@AetherUnbound AetherUnbound left a comment

New changes look good 👍🏼

@sarayourfriend sarayourfriend merged commit f0c44d5 into main Aug 17, 2022
@sarayourfriend sarayourfriend deleted the fix/watermark-useragent branch August 17, 2022 17:34
Development

Successfully merging this pull request may close these issues.

User-Agent setting needed for Wikimedia watermarked images
4 participants